feat(trial): zero-friction URL-to-workspace onboarding MVP #758
Merged
simple-agent-manager[bot] merged 35 commits into main on Apr 21, 2026
Lays groundwork for /try:
- shared types (Valibot)
- DB migration 0043 (system user sentinel + trial_waitlist table)
- wrangler TRIAL_COUNTER DO binding (v7 migration) + trial env vars
- trial services (HMAC-signed cookies with constant-time compare, KV kill-switch with 30s cache + fail-closed, discovery prompt)
- 501 route stubs under /api/trial/*
- TrialCounter DO with atomic transactionSync increment/decrement
- frontend Try/TryDiscovery stubs mounted at /try + /try/:trialId
- operator docs at docs/guides/trial-configuration.md
- 43 unit tests covering cookie round-trip/tamper/expiry, kill-switch cache/TTL/fail-closed, and TrialCounter cap enforcement
Trials remain disabled by default (kill-switch fails closed) so this is safe to deploy without setting TRIAL_CLAIM_TOKEN_SECRET. Wave 1 will wire the live create/events/claim/waitlist handlers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements backend lifecycle for zero-friction trial onboarding (Wave 1
Track A):
- trials table + sentinel-installation workaround (migration 0044)
- TrialCounter DO: fetch surface + tryIncrement/prune RPC methods
- POST /api/trial/create with Valibot validation, kill-switch gate,
GitHub repo probe (size/privacy), DO slot allocation, and
counter-decrement rollback on D1 failure
- GET /api/trial/status with fail-closed fallback when DO throws
- POST /api/trial/waitlist with lowercase-email dedupe via
onConflictDoNothing(email, resetDate)
- Three scheduled modules wired into cron dispatch:
- trial-expire: 5-min sweep marks expired trials
- trial-rollover: monthly DO pruning (0 3 1 * *)
- trial-waitlist-cleanup: daily notified-row purge (0 4 * * *)
- All configurable via DEFAULT_* constants + env overrides (Principle XI)
- 92 new behavioral tests covering resolution branches, DO RPC surface,
fallback semantics, cookie issuance, and fail-closed error paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
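The slot allocation and counter-decrement rollback described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual Durable Object: `TrialCounterSketch`, `createTrialWithRollback`, the month-key format, and the result strings are all hypothetical names, and a plain object stands in for the DO storage that the real class updates inside `transactionSync`.

```typescript
// Hypothetical in-memory stand-in for the TrialCounter Durable Object.
// A plain record plays the role of DO storage in this sketch.
type IncrementResult = { allowed: boolean; count: number };

class TrialCounterSketch {
  private counts: Record<string, number> = {};

  // Claim one slot for the given month unless the cap is already reached.
  tryIncrement(monthKey: string, cap: number): IncrementResult {
    const current = this.counts[monthKey] ?? 0;
    if (current >= cap) return { allowed: false, count: current };
    this.counts[monthKey] = current + 1;
    return { allowed: true, count: current + 1 };
  }

  // Release a slot (rollback path); the count never drops below zero.
  decrement(monthKey: string): number {
    const next = Math.max(0, (this.counts[monthKey] ?? 0) - 1);
    this.counts[monthKey] = next;
    return next;
  }

  // Drop counters for every month except the one to keep.
  prune(keepMonthKey: string): void {
    for (const key of Object.keys(this.counts)) {
      if (key !== keepMonthKey) delete this.counts[key];
    }
  }
}

// Allocate a slot, then give it back if persisting the trial row throws.
function createTrialWithRollback(
  counter: TrialCounterSketch,
  monthKey: string,
  cap: number,
  insertRow: () => void,
): "created" | "cap_exceeded" | "insert_failed" {
  if (!counter.tryIncrement(monthKey, cap).allowed) return "cap_exceeded";
  try {
    insertRow();
    return "created";
  } catch {
    counter.decrement(monthKey);
    return "insert_failed";
  }
}
```

The point of the rollback branch is that a failed insert should not permanently consume one of the month's trial slots.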
Builds the frontend components that gate the trial experience behind GitHub auth: a chat input with suggestion chips for anonymous users, and a login sheet that opens when they send their first message. Integration into TryDiscovery (SSE streaming `trial.idea` events) lands in wave-2 alongside the live /claim handler.
Components
- ChatGate: autogrowing textarea + horizontally-scrolling chip row; Cmd/Ctrl+Enter submits, Enter inserts newline; disabled state when empty/whitespace; surfaces submit errors without clearing the draft
- LoginSheet: responsive dialog (mobile bottom-sheet, desktop centered modal) with Escape/backdrop/close-button dismissal, focus trap between primary CTA + close, body scroll lock, return-to URL construction (trialId URL-encoded, ?claim=1 sentinel)
- SuggestionChip: 44px-tall touch target with title + optional summary, aria-label compose, disabled state
Hooks
- useTrialDraft: per-trialId localStorage draft with 400ms debounce (flush-on-unmount), synchronous writes when debounceMs=0, rehydrates on trialId change, no-ops with undefined trialId
- useTrialClaim: idle → claiming → submitting → done/error state machine; injectable claim/submit fns for testing; StrictMode-safe (single claim per mount); clears draft only on successful submit; preserves projectId when submit fails so UI can retry
Harness + tests
- TrialChatGateHarness at /__test/trial-chat-gate (public, not linked from nav) renders ChatGate + LoginSheet with query-param-driven mock data (ideas=0..20, long=1, auth=1, loginOpen=1) so Playwright can capture screenshots without hitting the real claim flow
- 43 new unit tests across components + hooks covering rendering, interactions, persistence, error states, focus management
- 13 Playwright visual scenarios at 375x667 + 1280x800: empty state, 1/5/20 chips (page-level overflow asserted false; chip row owns its horizontal scroll), long-text wrapping, anonymous send opening LoginSheet, bottom-sheet vs centered-modal layouts, 44px touch targets on send button + suggestion chips
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire trial onboarding backend so the post-OAuth claim flow and the
per-trial event stream work end-to-end.
- TrialEventBus DO: in-memory ring buffer (MAX_BUFFERED_EVENTS=500)
with long-poll /poll, /append with terminal-event auto-close, /close,
waiter-wake semantics. Configurable via TRIAL_EVENT_BUS_DEFAULT_POLL_TIMEOUT_MS.
- trial-store service: KV-backed writeTrial/readTrial/markTrialClaimed
with 3-key indexing (by trialId, by projectId, by fingerprint).
- trial-runner: mode-aware config resolution (staging=opencode+workers-ai,
production=claude-code+anthropic); production requires ANTHROPIC_API_KEY_TRIAL.
startDiscoveryAgent creates chat + ACP session with discovery prompt.
emitTrialEvent/emitTrialEventForProject append to TrialEventBus best-effort.
- GET /api/trial/:trialId/events: fingerprint-cookie-authenticated SSE.
Verifies trial record + HMAC signature + UUID match (fails closed on
any mismatch). Heartbeat every TRIAL_SSE_HEARTBEAT_MS (default 15s);
long-poll DO every TRIAL_SSE_POLL_TIMEOUT_MS; max duration
TRIAL_SSE_MAX_DURATION_MS. Closes on terminal event.
- POST /api/trial/claim: auth-required; verifies HMAC claim cookie;
atomic D1 UPDATE with WHERE userId=TRIAL_SENTINEL_USER_ID precondition;
clears claim cookie; returns {projectId, claimedAt}. Returns 409 on
UPDATE-changes=0 race.
- OAuth callback hook (maybeAttachTrialClaimCookie): on 2xx/3xx response
from /callback/github, if a valid fingerprint cookie maps to an unclaimed
non-expired trial, sign a claim token, set sam_trial_claim cookie, and
rewrite Location to https://app.${BASE_DOMAIN}/try/:trialId?claim=1.
- Env + wrangler binding for TRIAL_EVENT_BUS Durable Object.
70 new unit tests (6 files) cover DO long-poll/waiter-wake/terminal-close,
SSE auth-failure matrix + happy path, claim route 400/404/409/200 branches,
oauth-hook bail-out matrix + rewrite happy path, trial-runner config
resolution + error paths, and trial-store round-trips.
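As a rough illustration of the TrialEventBus buffering described above, here is a sketch of a bounded buffer with cursor-based reads and terminal-event auto-close. It deliberately leaves out the long-poll and waiter-wake parts. The class name, the cursor scheme, and the choice of which event types count as terminal are assumptions made for the example, not details taken from the PR.

```typescript
// Hypothetical event shape; the real TrialEvent union has more fields.
type BusEvent = { type: string; at: number };
type Buffered = { cursor: number; event: BusEvent };

class EventRingBuffer {
  private buffer: Buffered[] = [];
  private nextCursor = 0;
  private closed = false;
  private readonly maxEvents: number;
  private readonly terminalTypes: string[];

  constructor(maxEvents: number, terminalTypes: string[]) {
    this.maxEvents = maxEvents;
    this.terminalTypes = terminalTypes;
  }

  // Returns false once the bus is closed; otherwise stores the event,
  // evicts the oldest entry when over capacity, and closes on a terminal type.
  append(event: BusEvent): boolean {
    if (this.closed) return false;
    this.buffer.push({ cursor: this.nextCursor, event });
    this.nextCursor += 1;
    if (this.buffer.length > this.maxEvents) this.buffer.shift();
    if (this.terminalTypes.indexOf(event.type) !== -1) this.closed = true;
    return true;
  }

  // Everything still buffered at or after `cursor`, plus the cursor to poll with next.
  readSince(cursor: number): { events: BusEvent[]; nextCursor: number; closed: boolean } {
    const events = this.buffer
      .filter((entry) => entry.cursor >= cursor)
      .map((entry) => entry.event);
    return { events, nextCursor: this.nextCursor, closed: this.closed };
  }
}
```

A poller would call `readSince` with the last `nextCursor` it received and stop once `closed` is true and no events remain.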
Replaces Wave 0 stubs with full trial discovery flow:
- Try landing page with GitHub URL validation + error branches (invalid_url, repo_private, trials_disabled, cap_exceeded, existing_trial)
- TryDiscovery streams SSE events (started, progress, knowledge, idea, ready) with exponential backoff reconnect (max 5 retries) and renders repo header, progress, knowledge graph, ideas, and workspace-ready CTA
- TryCapExceeded page with waitlist email capture + inline validation
- TryWaitlistThanks confirmation page
- trial-api client: createTrial, joinWaitlist, openTrialEventStream
- ChatGate stub placeholder for Track D integration
Tests:
- Vitest component tests for Try + TryCapExceeded (11 cases: URL validation, success nav, existing-trial resume, each error branch, email validation, waitlist submit, API error)
- Playwright visual audit at 375x667 and 1280x800 covering landing, discovery (streaming/ready/empty), cap-exceeded, waitlist-thanks, and all inline error states; overflow asserted on every test
Mobile-first with design tokens; 56px primary CTA, 44px secondary touch targets; env(safe-area-inset-*) padding.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
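The reconnect policy mentioned for TryDiscovery (exponential backoff, max 5 retries) might look something like the sketch below. Only the retry limit comes from the commit message; the base delay, the delay ceiling, and the function name are assumed values chosen for illustration.

```typescript
const MAX_RETRIES = 5; // from the commit message
const BASE_DELAY_MS = 1000; // assumed starting delay
const MAX_DELAY_MS = 16000; // assumed ceiling

// Delay before reconnect attempt number `attempt` (0-based),
// or null when the retry budget is spent and the UI should show an error.
function reconnectDelayMs(attempt: number): number | null {
  if (attempt >= MAX_RETRIES) return null;
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
}
```

With these assumed values, attempts 0 through 4 wait 1s, 2s, 4s, 8s, and 16s, and a sixth attempt is refused.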
… integration Resolves conflict in ChatGate.tsx by keeping Track D's real implementation; adapts TryDiscovery to Track D's ChatGate contract (TrialIdea shape, onAuthenticatedSubmit handler that navigates to the claimed project chat with the message staged in sessionStorage).
… kill-switch
Previously, self-hosters had to manually run `wrangler secret put TRIAL_CLAIM_TOKEN_SECRET` and `wrangler kv key put trials:enabled true` before the /try flow would work on staging. Wire both into the standard deployment pipeline so staging trials are live out of the box.
Changes:
- infra/resources/secrets.ts: add `trial-claim-token-secret` RandomId resource (32 bytes base64) + export `trialClaimTokenSecret` Pulumi output, same persistence pattern as encryptionKey / jwtPrivateKey.
- infra/index.ts: re-export the new output.
- scripts/deploy/configure-secrets.sh: read trialClaimTokenSecret from Pulumi state and set it as a required Worker secret on every deploy.
- .github/workflows/deploy-reusable.yml: add a staging-only step that sets KV `trials:enabled=true` via wrangler after the worker deploys. Production stays opt-in per spec (operator flips the flag manually when ready to accept live trial traffic).
- docs/guides/trial-configuration.md: document the automation; no more manual secret-put or kv-put steps for staging.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`wrangler kv key put` writes to remote by default; --remote is not a valid flag for that subcommand and caused the staging deploy's trial kill-switch step to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…olve it
Track A (create.ts) inserted trial records into D1 only; Track B readers
(events.ts, claim.ts, trial-runner.ts) all look trials up via
trial-store.readTrial() which reads from KV. The result: every SSE
connection 404'd with "Trial not found or expired" seconds after the
trial was created.
Integration fix:
- create.ts calls writeTrial() after the D1 insert, with projectId=''
(Track B's orchestrator rewrites the KV record once the project row
exists). On KV failure, roll back the D1 row and release the
TrialCounter slot so we don't burn a cap entry.
- writeTrial() skips the trial-by-project index when projectId is
empty, preventing all pending trials from colliding on
`trial-by-project:`.
- events.ts: use errors.notFound('Trial') — previous argument produced
doubled "Trial not found or expired not found".
Added a regression test asserting writeTrial is invoked from the happy
path (captures the exact KV put) so this bug cannot silently recur.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
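The index-skipping fix described above can be pictured with a small key-derivation helper. The `trial-by-project:` and `trial-by-fingerprint:` prefixes appear in this PR; the `trial:` prefix for the primary key, the record shape, and the function name are assumptions for the sketch.

```typescript
// Hypothetical record shape; the real trial record carries more fields.
type TrialRecord = { trialId: string; projectId: string; fingerprint: string };

// KV keys a write should touch. A pending trial (projectId === '') gets no
// project index, so pending trials cannot overwrite each other under the
// bare `trial-by-project:` key.
function trialKeys(record: TrialRecord): string[] {
  const keys = [
    `trial:${record.trialId}`, // assumed primary-key prefix
    `trial-by-fingerprint:${record.fingerprint}`,
  ];
  if (record.projectId !== "") {
    keys.push(`trial-by-project:${record.projectId}`);
  }
  return keys;
}
```

Once the orchestrator rewrites the record with a real projectId, the same helper would yield all three keys.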
Contributor
Author
Staging verification update: trials automation + integration fix
Two follow-up commits landed after the initial PR:
1. Deploy automation (commits
2. Wave 1 integration bug fix (commit
Staging verification evidence (run
Updated configuration checklist for @raphaeltm:
Production deploy and merge remain deferred per your instructions.
…760)
* task: move trial-orchestrator-wire-up to active
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(shared): add trial orchestrator timing/retry constants
Introduce DEFAULT_TRIAL_ORCHESTRATOR_* and DEFAULT_TRIAL_KNOWLEDGE_* constants used by the alarm-driven TrialOrchestrator DO and the fast-path GitHub knowledge probes fired from POST /api/trial/create. Every value is env-var overridable (Constitution Principle XI).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(trial): add TrialOrchestrator DO binding, env vars, sentinel installation
- Declare TRIAL_ORCHESTRATOR DO binding + v9 migration in wrangler.toml
- Extend Env interface with TrialOrchestrator/Knowledge tuning knobs and TRIAL_ANONYMOUS_INSTALLATION_ID override
- Migration 0045 seeds the system_anonymous_trials_installation sentinel row so anonymous trial projects can satisfy the NOT NULL + FK constraint on projects.installation_id without owning a real GitHub App install
The DO class itself is added in the next commit.
* feat(trial): add TrialOrchestrator DO state machine
Adds the alarm-driven TrialOrchestrator Durable Object (one per trialId) that replaces the fire-and-forget `waitUntil(provisionTrial())` pattern with a resumable state machine. Module layout mirrors TaskRunner:
- types.ts: TrialOrchestratorStep union + persisted state shape
- helpers.ts: re-exports TaskRunner helpers; adds sentinel-user / sentinel-installation resolvers + safeEmitTrialEvent.
- steps.ts: per-step handlers (project_creation, node_selection, node_provisioning, node_agent_ready, workspace_creation, workspace_ready, discovery_agent_start, running).
- index.ts: DO class with start(), alarm() dispatch, backoff retry, overall-timeout guard, trial.error emission on failure.
Each step emits `trial.progress` at entry so the SSE stream reflects where the orchestrator is. Terminal `running` step is idle; the ACP bridge (wired separately) is responsible for emitting `trial.ready` after the discovery agent produces its first assistant turn.
All timing/retry knobs read from env vars with DEFAULT_* fallbacks (Constitution Principle XI). Adds two new optional env fields: TRIAL_VM_SIZE and TRIAL_VM_LOCATION for trial-specific VM overrides.
Exports the class from apps/api/src/index.ts so the Workers runtime can instantiate it via the TRIAL_ORCHESTRATOR binding (already declared in wrangler.toml v9 migration).
Task: tasks/active/2026-04-19-trial-orchestrator-wire-up.md
* feat(trial): bridge ACP/MCP events into trial SSE stream
Adds a dedicated `services/trial/bridge.ts` module with three helpers that hook into existing hot paths and fan qualifying events out as `trial.*` SSE events:
- bridgeAcpSessionTransition: `running` → trial.ready (with workspaceUrl derived from BASE_DOMAIN + workspaceId), `failed` → trial.error.
- bridgeKnowledgeAdded: fires trial.knowledge when the discovery agent adds a knowledge observation via MCP.
- bridgeIdeaCreated: fires trial.idea with a summary-clipped excerpt when the discovery agent creates an idea via MCP.
All three helpers short-circuit on non-trial projects after a single `readTrialByProject(env, projectId)` KV lookup, so normal (non-trial) project traffic only pays that one extra KV read on qualifying events.
Hook sites:
- ProjectData DO `transitionAcpSession`: dynamic-imports the bridge and dispatches after the transition succeeds, guarded by `if (projectId)` and wrapped in try/catch so bridge errors never block the transition. Casts `this.env` through unknown to the worker-scope Env because the DO's local Env type is intentionally narrow.
- `handleAddKnowledge` MCP handler: dispatches after addKnowledgeObservation.
- `handleCreateIdea` MCP handler: dispatches after the DB insert.
Every dispatch is fire-and-forget; bridge errors are already caught inside each helper but the call sites add a second try/catch for defense.
Task: tasks/active/2026-04-19-trial-orchestrator-wire-up.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(trial): wire TrialOrchestrator + GitHub knowledge into POST /api/trial/create
Adds two fire-and-forget dispatches after the trial record is written and before the HTTP response returns, via c.executionCtx.waitUntil:
1. TrialOrchestrator DO `start()`: kicks off the alarm-driven state machine that provisions a project, workspace, and discovery agent session. The DO is idempotent on `start()`, so accidental re-invocations no-op.
2. emitGithubKnowledgeEvents(): hits unauthenticated GitHub REST endpoints (`/repos/:o/:n`, `/repos/:o/:n/languages`, `/repos/:o/:n/readme`) in parallel and emits up to `TRIAL_KNOWLEDGE_MAX_EVENTS` `trial.knowledge` events within ~`TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS` each. Surfaces description, primary language, stars, topics, license, language breakdown, and README first paragraph so the SSE stream shows activity within ~3s while the VM provisions in the background.
Both helpers fully swallow errors: an orchestrator dispatch failure or GitHub rate-limit hit never blocks the response or crashes the Worker.
All knobs are env-configurable per Constitution Principle XI:
- TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS (default 5000)
- TRIAL_KNOWLEDGE_MAX_EVENTS (default 10)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(trial): cover orchestrator dispatch, bridge, and GitHub knowledge probe
Adds four categories of behavioral tests for the trial onboarding wiring:
1. trial-create.ts.test.ts (+2 cases)
- Asserts TrialOrchestrator.start() is dispatched via waitUntil with trialId, repoOwner, repoName, and canonical repoUrl.
- Asserts a rejecting start() does NOT propagate; the HTTP response still returns 201 (fire-and-forget contract).
- Updates makeEnv() to stub TRIAL_ORCHESTRATOR + TRIAL_EVENT_BUS bindings and introduces makeExecutionCtx() helper.
- Also adds a graceful-fallback in create.ts so routes that run without a Worker executionCtx (unit tests) still complete instead of 500-ing on Hono's "This context has no ExecutionContext" throw.
2. trial-github-knowledge.test.ts (new, 5 cases)
- Happy path: verifies description, primary language, stars, topics, license, language breakdown, and README paragraph are all emitted.
- TRIAL_KNOWLEDGE_MAX_EVENTS cap is enforced.
- Total network failure → 0 events, no throw.
- Non-2xx repo metadata response → 0 events, no throw.
- emitTrialEvent rejection → no throw (last line of defense).
3. trial-orchestrator.test.ts (new, 4 cases)
- start() persists initial state with currentStep='project_creation' and schedules an alarm.
- start() is idempotent: second call with same input is a no-op and does not re-schedule the alarm.
- alarm() on a completed state is a terminal no-op.
- alarm() emits trial.error and marks completed when the overall timeout budget is exceeded.
4. trial-bridge.test.ts (new, 9 cases)
- bridgeAcpSessionTransition: no-ops on non-trial projects, emits trial.ready on 'running' with ws-{id}.{BASE_DOMAIN} URL, emits trial.error on 'failed', no-ops on other transitions, swallows emitter errors.
- bridgeKnowledgeAdded / bridgeIdeaCreated: no-op on non-trial, emit correct event shape when trial exists, swallow errors.
All 3,793 tests pass; typecheck clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs(trial): document TrialOrchestrator + GitHub knowledge fast-path
Adds an "Orchestrator and Fast-Path Knowledge" section to the trial configuration guide covering the two fire-and-forget background tasks dispatched from POST /api/trial/create (TrialOrchestrator DO and the GitHub REST knowledge probe) plus the ACP/MCP event bridge, with tunables tables for both. Also records the change in CLAUDE.md "Recent Changes" and marks the corresponding checklist items in the task file.
* style(trial): sort imports per eslint rules
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(trial): emit trial.started event from orchestrator start()
The SSE stream's first real event must be `trial.started` so the frontend can transition out of the "Warming up..." empty state. Without it, viewers sat on the placeholder until `trial.progress` or `trial.knowledge` arrived, which could be 3-5s later.
Added unit test asserting `emitTrialEvent` is called exactly once with type='trial.started' and the expected shape.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(trial): capability test chaining start() + alarm() through event bus
Addresses task-completion-validator HIGH finding #2: no capability test exercised the full orchestrator state machine through the event bus seam. Existing per-method tests covered each transition in isolation but did not chain them.
New test drives:
start() → persist + setAlarm + emit trial.started → (simulate expired budget) → alarm() → mark failed + emit trial.error
The `emitTrialEvent` mock is the event-bus seam; its downstream is already covered by tests/unit/routes/trial-events.test.ts which verifies the bus → SSE stream path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore(trial): archive orchestrator wire-up task
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(trial): cover alarm() retry/backoff + step handler invariants
Addresses test-engineer review HIGH findings #1 and #2 (partial).
Finding #1, alarm() retry/backoff: Added 4 tests driving the step-error catch branches via a `./steps` vi.mock.
Covers transient-error + retries-remaining (increments counter and schedules backoff, no failTrial), permanent-error (immediate failTrial regardless of budget), transient-error with retries exhausted (promotes to failTrial), and the null-state guard (alarm fires before start()).
Finding #2, step handlers: New `trial-orchestrator-steps.test.ts` covers the two highest-value invariants that don't need D1/DO plumbing mocks:
- handleRunning marks state.completed = true
- handleDiscoveryAgentStart throws permanent on missing IDs
- handleDiscoveryAgentStart is idempotent when session already linked
Broader per-handler coverage (project_creation / node_selection / node_provisioning / node_agent_ready / workspace_creation / workspace_ready) tracked in tasks/backlog/2026-04-19-trial-orchestrator-step-handler-coverage.md; those paths require mocks for drizzle + node-agent + project-data services and are out of scope for this PR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(trial): remove hardcoded BASE_DOMAIN fallback + extract heartbeat skew constant
Addresses constitution-validator findings:
HIGH: bridge.ts:41 had `env.BASE_DOMAIN || 'workspaces.example.com'` fallback. BASE_DOMAIN is a non-optional binding; a misconfiguration that let it be empty would silently generate workspace URLs pointing at workspaces.example.com instead of failing loudly. Removed the fallback.
MEDIUM: steps.ts had a hardcoded `30_000` heartbeat-skew window. Extracted to DEFAULT_TRIAL_ORCHESTRATOR_HEARTBEAT_SKEW_MS (shared), TRIAL_ORCHESTRATOR_HEARTBEAT_SKEW_MS env override, getHeartbeatSkewMs() getter on the DO, threaded through TrialOrchestratorContext.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(trial): per-IP rate limit on POST /api/trial/create + SSE injection guard
Addresses security-auditor HIGH findings:
1. Rate limit on POST /api/trial/create (was missing)
- New rateLimitTrialCreate() factory (useIp=true, keyPrefix=trial-create)
- Default 10 req/hr, configurable via RATE_LIMIT_TRIAL_CREATE env var
- Tighter than the general anonymous bucket because each trial create allocates a Durable Object, fires ~4 GitHub API calls, and consumes a monthly trial slot
- Mounted per-route in create.ts so the limiter sees request env
- Regression test exercises 429 path with IP-scoped KV window
2. SSE event-name sanitization in formatSse()
- Strips CR/LF to prevent SSE-frame injection if a future caller ever bypasses the TrialEvent discriminated union via `as never` casts or dynamic event names
- Function now exported for direct testing
- New trial-events-format.test.ts covers: happy path stable shape, CR/LF strip on hostile event name (single event frame survives), and JSON data escaping for embedded newlines
* fix(trial): switch TrialOrchestrator to new_sqlite_classes + drop premature status gate
Addresses cloudflare-specialist HIGH findings:
1. wrangler.toml v9 migration: new_classes -> new_sqlite_classes
Cloudflare recommends SQLite-backed storage for new DO classes; the KV-style ctx.storage.put() API works identically on both backends but SQLite is the future-forward choice. TrialOrchestrator has not yet been deployed to any environment (introduced in this PR chain), so flipping the migration type is safe.
2. handleNodeProvisioning: remove synchronous status='running' gate
After provisionNode() returns, async-IP providers (Scaleway, GCP) leave the node in 'creating' status; the IP and status='running' flip happens on the first heartbeat. Synchronously requiring status='running' here forced every async-IP trial through the retry/backoff cycle until the heartbeat landed, wasting retry budget and risking permanent failure on slow VM boots. The next step (node_agent_ready) polls heartbeat freshness with its own timeout, which correctly handles both sync (Hetzner) and async (Scaleway/GCP) provisioning paths.
Regression test: handleNodeProvisioning advances to node_agent_ready even when provisionNode() leaves the node in 'creating' status.
* fix(trial): HMAC-verify fingerprint cookie before reusing UUID
Security-auditor HIGH: the old code extracted the fingerprint UUID from the `sam_trial_fingerprint` cookie by splitting on the last `.` without verifying the HMAC signature. An attacker who learned a victim's fingerprint UUID (from logs, a captured cookie, or a prior trial row) could forge `<victimUuid>.anything` to overwrite the `trial-by-fingerprint:<victimUuid>` KV index to point at their own trial. The victim's subsequent OAuth hook lookup would then redirect them to the attacker's trial project.
Fix: call verifyFingerprint(existingFp, secret) and only trust the returned UUID. Fall back to crypto.randomUUID() on invalid / missing signature. The secret is already resolved earlier in the same handler (line 195-203).
Added regression test in trial-create.ts.test.ts: a forged cookie MUST NOT reuse the victim's UUID; a fresh UUID is minted instead. Updated the "reuses existing fingerprint" test to use a validly-signed cookie.
---------
Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
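The three alarm() error branches covered by the retry/backoff tests above (transient with retries remaining, permanent, transient with retries exhausted) can be summarized as a pure decision function. This is a sketch under assumptions: the type names, the function name, and the exponential shape of the backoff are illustrative, since the PR only says the orchestrator schedules a backoff.

```typescript
// Hypothetical error and decision shapes for illustration only.
type StepError = { permanent: boolean; message: string };
type RetryDecision =
  | { action: "retry"; retryCount: number; delayMs: number }
  | { action: "fail"; reason: string };

function decideAfterStepError(
  err: StepError,
  retryCount: number,
  maxRetries: number,
  baseDelayMs: number,
  maxDelayMs: number,
): RetryDecision {
  // Permanent errors fail the trial immediately, whatever budget remains.
  if (err.permanent) return { action: "fail", reason: err.message };
  // A transient error with no retries left is promoted to a failure.
  if (retryCount >= maxRetries) {
    return { action: "fail", reason: `retries exhausted: ${err.message}` };
  }
  // Otherwise bump the counter and schedule the next alarm after a backoff
  // (exponential growth here is an assumption of this sketch).
  const delayMs = Math.min(baseDelayMs * Math.pow(2, retryCount), maxDelayMs);
  return { action: "retry", retryCount: retryCount + 1, delayMs };
}
```

In a Durable Object, the `retry` branch would correspond to persisting the new counter and setting the next alarm, and the `fail` branch to emitting `trial.error`.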
* task: move trial-onboarding-ux-polish to active
* feat(trial): polish discovery feed with skeleton timeline + knowledge grouping
- Extract all timing/threshold constants to trial-ui-config.ts (Constitution XI)
- Add STAGE_LABELS map + friendlyStageLabel() for orchestrator stage strings
- TryDiscovery: render StageSkeleton timeline before first SSE event arrives
- TryDiscovery: group rapid trial.knowledge events into a single card
- TryDiscovery: surface "taking longer than usual" hint when SSE silent for 20s
- TryDiscovery: retry-aware terminal error panel
- ChatGate: spinner + aria-busy on send, snap-x chip scroll, anonymous hint copy
- Try: friendlier validation copy, testid hooks for landing audit
* test(trial): cover stage-label mapping + skeleton/error/knowledge-burst Playwright cases
* task: archive trial-onboarding-ux-polish
* fix(trial): SSE replay dedup, accessible badges, larger touch targets
Addresses Phase 5 review findings on the trial onboarding UX polish PR:
CRITICAL — SSE event replay duplication
EventSource silently re-opens after a transport error and the server may
replay any buffered events the client missed. Without dedup, the feed
duplicated every replayed event. Add a composite (`type:at`) dedup set
in TryDiscovery that resets on trialId change.
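A minimal sketch of the composite-key dedup described above, assuming each event carries a `type` string and a numeric `at` timestamp. `eventDedupKey` is named in this PR's tests; `createDeduper` and the event shape are hypothetical.

```typescript
// Assumed minimal event shape; only the two fields used by the key matter here.
type TrialEvent = { type: string; at: number };

// Composite key: same type at the same timestamp is treated as a replay.
function eventDedupKey(event: TrialEvent): string {
  return `${event.type}:${event.at}`;
}

// Returns a predicate that is true the first time an event is seen and
// false for replays. Creating a new deduper models the reset on trialId change.
function createDeduper(): (event: TrialEvent) => boolean {
  const seen = new Set<string>();
  return (event) => {
    const key = eventDedupKey(event);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  };
}
```

Two events that share a timestamp but differ in type produce different keys, so both are kept.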
HIGH — color-only ConnectionBadge (WCAG 1.4.1)
Status was conveyed by background color alone. Prepend a Unicode shape
indicator (●/✕/↺/○) so the meaning is also conveyed in monochrome.
HIGH — knowledge toggle hit area (WCAG 2.5.5)
The "+N more" toggle on grouped knowledge cards was 24px tall — below
the 44px touch-target minimum. Promote to min-h-11 with vertical hit
padding.
MEDIUM — semantic header role + truncation hint
The sticky discovery header used role="banner" (reserved for the
page-wide masthead) and the truncated repo title had no full-text
hover affordance. Switch to role="region" + aria-label and move the
title attribute to the truncating wrapper.
LOW — error CTA touch targets
The "Try again" / "Join the waitlist" Links were below 44px. Promote
to inline-flex min-h-[44px].
Tests
- try-discovery-dedup.test.ts: behavioural coverage of eventDedupKey
and the dedup branch in onEvent (3 scenarios: identical replay,
chronological non-collision, type-vs-timestamp collision).
- try-discovery-build-feed.test.ts: boundary coverage of buildFeed
(within-window merge, exact-boundary `<=` merge, +1ms split,
interleaved non-knowledge break, error-event exclusion).
- ChatGate.test.tsx: spinner visible/hidden behavioural test using a
deferred promise (idle → sending → resolved transitions).
- trial-ui-audit.spec.ts: knowledge-burst test now asserts exactly one
grouped card (was: presence only) and exercises the expand toggle.
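The buildFeed boundary cases listed above (within-window merge, inclusive `<=` boundary, +1ms split, interleaved break, error exclusion) suggest grouping logic along these lines. This is a guess at the shape, not the shipped code: the item types, the default window value, and measuring the gap from the previous event in the group are all assumptions.

```typescript
type FeedEvent = { type: string; at: number };
type FeedItem =
  | { kind: "knowledge-group"; events: FeedEvent[] }
  | { kind: "single"; event: FeedEvent };

const KNOWLEDGE_GROUP_WINDOW_MS = 1500; // assumed default, not from the PR

function buildFeed(
  events: FeedEvent[],
  windowMs: number = KNOWLEDGE_GROUP_WINDOW_MS,
): FeedItem[] {
  const feed: FeedItem[] = [];
  for (const event of events) {
    // Error events are excluded from the feed (assumed to render elsewhere).
    if (event.type === "trial.error") continue;
    const last = feed[feed.length - 1];
    if (
      event.type === "trial.knowledge" &&
      last !== undefined &&
      last.kind === "knowledge-group"
    ) {
      const prev = last.events[last.events.length - 1];
      // Inclusive boundary: a gap of exactly windowMs still merges.
      if (event.at - prev.at <= windowMs) {
        last.events.push(event);
        continue;
      }
    }
    let item: FeedItem;
    if (event.type === "trial.knowledge") {
      item = { kind: "knowledge-group", events: [event] };
    } else {
      item = { kind: "single", event };
    }
    feed.push(item);
  }
  return feed;
}
```

Because only the immediately preceding feed item is checked, any non-knowledge event between two knowledge events breaks the group.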
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(trial): keep StageSkeleton visible after lone trial.started; forward Alert testid
Two narrow fixes uncovered by Playwright visual audit:
1. **StageSkeleton hides too eagerly.** `showSkeleton = events.length === 0`
meant a lone `trial.started` event (which is just an acknowledgement,
not visible progress) caused the "Setting things up" roadmap to vanish
while the user was still staring at a blank screen. Tighten to "no
substantive events yet" — keep showing the roadmap until a real
progress / knowledge / idea / ready / error event arrives.
2. **`Alert` drops `data-testid`.** The shared design-system `Alert`
component didn't declare or forward `data-testid`, so
`<Alert variant="error" data-testid="trial-error-panel">` silently
discarded the prop and the terminal-error Playwright assertion
couldn't find the panel. Add the prop to `AlertProps` and forward it
to the rendered `<div role="alert">`.
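The tightened skeleton condition from fix 1 could be expressed roughly as below. The list of substantive event types follows the commit text (progress / knowledge / idea / ready / error); the function and constant names are hypothetical.

```typescript
// Event types that count as visible progress, per the commit message.
const SUBSTANTIVE_TYPES = [
  "trial.progress",
  "trial.knowledge",
  "trial.idea",
  "trial.ready",
  "trial.error",
];

// Keep the "Setting things up" roadmap until a substantive event arrives.
// A lone trial.started is only an acknowledgement and does not hide it.
function shouldShowSkeleton(events: { type: string }[]): boolean {
  return !events.some((e) => SUBSTANTIVE_TYPES.indexOf(e.type) !== -1);
}
```

Compared with `events.length === 0`, this keeps the roadmap on screen during the gap between `trial.started` and the first real update.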
All 45 Playwright trial-ui-audit tests now pass across iPhone SE,
iPhone 14, and Desktop projects.
---------
Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
)

* task: move trial-events-debug to active

* task: instrument trial event bus path for staging triage

Add high-signal log.info points at every boundary in the trial event flow so `wrangler tail` can show exactly where the pipeline drops:
- create.ts: log dispatch_begin, orchestrator_task.{enter,stub_ready,start_returned}, knowledge_task.{enter,done}, waitUntil_registered
- trial-runner.ts:emitTrialEvent: log emit_begin / emit_ok
- trial-orchestrator: start.enter, state_put, alarm_set, trial_started_emitted; alarm.enter
- trial-event-bus: handleAppend.enter / stored / rejected_closed

Pure instrumentation, no behavior change. Will be pared back or removed once the failure mode is identified on staging.

* fix(trial): emit unnamed SSE frames so EventSource.onmessage fires

Root cause of the zero-events-on-staging incident (2026-04-19): formatSse() wrote named SSE frames ('event: trial.knowledge\ndata: {...}') but the frontend subscribes via source.onmessage, which only fires for the default (unnamed) event. Bytes arrived on the wire (curl saw them), but no frontend-visible event was ever dispatched.

Change the SSE serializer to emit unnamed frames ('data: {...}'). The TrialEvent payload itself carries a 'type' discriminator so no information is lost. Update the unit test to lock in the new contract (no 'event:' line) and point at the post-mortem.

Also fix a latent eventsUrl contract mismatch: POST /api/trial/create returned '/api/trial/events?trialId=X' while the real route is '/api/trial/:trialId/events'. The frontend builds its own URL so end-users weren't affected, but the response-field contract was wrong. The previous unit test used toContain() on a substring, masking the drift.

See docs/notes/2026-04-19-trial-sse-named-events-postmortem.md.

* test(trial): add TrialEventBus → SSE capability test

Regression guard for the 2026-04-19 incident. Seeds a trial in KV, appends events directly on the TrialEventBus DO (identical to emitTrialEvent()), opens the SSE stream via SELF.fetch with a valid fingerprint cookie, reads the raw stream bytes, and asserts:
- HTTP 200 + correct content-type
- At least one 'data: {...}' frame
- No 'event:' line anywhere (the regression guard)
- The parsed JSON payload round-trips through the bus intact

Also add TRIAL_EVENT_BUS DO binding and TRIAL_* env bindings to the workers vitest config so this test (and future trial-related worker tests) can construct stubs.

Note: the existing workers test pool is currently broken on this branch and base (miniflare WebSocket exits unexpectedly on all 6 pre-existing worker tests too; not caused by this change). Once the pool is unblocked this test runs as-is.

* docs(trial): post-mortem + rule 13 ban curl-only SSE verification

Post-mortem covers what broke, the two-layer contract mismatch (named SSE events + wrong eventsUrl shape), timeline, why it wasn't caught (no E2E capability test, curl used instead of a real browser, frontend test path not exercised), the class of bug, and the process fixes landing in this PR.

Update rule 13 (staging verification) to explicitly ban curl-only verification for browser-consumed SSE/WebSocket streams: curl confirms the byte stream, only a real browser confirms dispatch to onmessage.

* task: record root cause + fixes on trial SSE events task

* test(trial): update trial-events.test SSE assertion for unnamed frames

The integration test for GET /api/trial/:trialId/events was asserting the old named-event contract ('event: trial.ready'). With the formatSse() fix the frame is unnamed; update the assertion to lock in the new contract (data: line present, no event: line).

* task: archive trial SSE events debugging task

* chore(trial): address review findings on SSE events fix

- Add TRIAL_ORCHESTRATOR + TRIAL_COUNTER DO bindings to apps/api/vitest.workers.config.ts (cloudflare-specialist MEDIUM)
- CLAUDE.md: prepend 'trial-sse-events-fix' entry to Recent Changes (doc-sync-validator MEDIUM)
- Fix broken link in postmortem (tasks/active -> docs/notes) and tick the completed rule-13 follow-up checkbox (doc-sync-validator LOW)
- Add cross-reference from .claude/rules/02-quality-gates.md to the rule-13 curl-only SSE-verification ban (doc-sync-validator LOW)
- File pre-existing HIGH (AbortController not propagated into busStub.fetch) and MEDIUM (nextCursor persistence) as backlog tasks so they're tracked but don't block this fix PR

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
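The named-versus-unnamed distinction behind the SSE fix can be sketched as below. This is an illustration only, not the repository's `formatSse()`: the `TrialEvent` shape and both function names are assumptions made for the example. In the SSE wire format, a frame carrying an `event:` field is dispatched only to listeners registered for that name, while a frame with only `data:` reaches `onmessage`.

```typescript
// Hypothetical event shape; the real TrialEvent type lives in the shared package.
interface TrialEvent {
  type: string; // discriminator such as "trial.knowledge"
  [key: string]: unknown;
}

// Named frame: delivered only to addEventListener("<type>", ...), never to onmessage.
function formatNamedFrame(event: TrialEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Unnamed frame: delivered to EventSource.onmessage; the type travels inside the payload.
function formatUnnamedFrame(event: TrialEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

const sample: TrialEvent = { type: "trial.knowledge", text: "uses pnpm workspaces" };
const named = formatNamedFrame(sample);
const unnamed = formatUnnamedFrame(sample);
console.log(JSON.stringify(unnamed));
```

Because the discriminator stays in the JSON payload, a consumer can still branch on `type` after parsing the `data:` line.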
…764)

* task: move trial orchestrator agent-boot task to active

* feat(trial): boot discovery agent on VM + detect real default branch

Two bugs blocked the trial demo from working end-to-end:

1. handleDiscoveryAgentStart only created chat + ACP session records but never called createAgentSessionOnNode / startAgentSessionOnNode. The ACP session sat in `pending` forever, never transitioning to `running`, so `trial.ready` never fired.
2. Project defaultBranch + workspace branch were hardcoded to 'main', so trials on master-default repos (e.g. octocat/Hello-World) failed the VM-side `git clone --branch main`.

Fix (mirrors TaskRunner's agent-session-step pattern):
- Add `defaultBranch`, `mcpToken`, `agentSessionCreatedOnVm`, `agentStartedOnVm`, `acpAssignedOnVm`, `acpRunningOnVm` fields to TrialOrchestratorState for crash-safe idempotency.
- `fetchDefaultBranch()` probes GitHub's public API with a 5s AbortController timeout (TRIAL_GITHUB_TIMEOUT_MS override), falls back to 'main' on any failure. Threaded through both `projects.default_branch` and the workspace-side `git clone --branch`.
- `handleDiscoveryAgentStart` now runs a 5-step idempotent flow:
  1. startDiscoveryAgent (existing) -> chat + ACP session records.
  2. createAgentSessionOnNode (new) -> D1 agent_sessions row + VM agent registers the session.
  3. generateMcpToken + storeMcpToken (new) -> KV token so the agent can call add_knowledge / create_idea.
  4. startAgentSessionOnNode (new) -> VM agent boots the agent subprocess with the discovery prompt + MCP server URL.
  5. transitionAcpSession pending -> assigned -> running -> the trial bridge emits `trial.ready` with workspaceUrl.
- Trial's synthetic taskId = state.trialId (trials have no tasks row), so MCP rate-limiting keys per-trial. Drop get_instructions from the initial prompt since it'd 404 against the tasks table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(trial): capability coverage for orchestrator VM agent boot

Adds trial-orchestrator-agent-boot.test.ts asserting the 3-step VM boot pattern + ACP pending→assigned→running transitions + idempotency across crash/retry. Updates trial-orchestrator-steps.test.ts for the new nodeId requirement and adds mocks for node-agent/mcp-token/project-data services. Also adds fetchDefaultBranch coverage (master, 404 fallback, network error fallback, idempotent re-entry).

Post-mortem at docs/notes/2026-04-19-trial-orchestrator-agent-boot-postmortem.md. Process fix: adds port-of-pattern coverage bullet to .claude/rules/10-e2e-verification.md so a port of TaskRunner's agent-session pattern into a new consumer must assert every step fired.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: archive trial orchestrator agent-boot task

* docs(trial): add CLAUDE.md Recent Changes + TRIAL_GITHUB_TIMEOUT_MS row

* fix(trial): persist defaultBranch before D1 insert + redact mcpToken in getStatus

Cloudflare-specialist review (HIGH): two fixes

1. handleProjectCreation now persists state.defaultBranch before the D1 projects insert. Previously a crash between the D1 write and the DO state persist could cause a retry to re-probe GitHub and resolve a different branch than what had already landed in the projects row.
2. getStatus() now redacts the live mcpToken bearer credential before returning state to any debug/admin caller. The stale comment claiming the DO doesn't store secrets is corrected.

* fix(trial): revoke MCP token on failure + redaction test + review doc sync

Addresses Phase 5 reviewer findings from the trial-agent-boot PR:

security-auditor HIGH:
- Revoke state.mcpToken in failTrial() before emitting trial.error. Mirrors TaskRunner's state-machine.ts:265-275 pattern; closes the 4-hour TTL window where a leaked/botched-trial bearer token stays usable.
- Document the intentional non-revocation in handleRunning(): the orchestrator terminates but the discovery agent still needs the token for MCP calls during the 20-min workspace TTL.
- Document the sentinel userId scoping limitation on resolveAnonymousUserId so future trial code remembers that per-user queries do NOT isolate trials from each other; projectId/trialId scoping is mandatory.

task-completion-validator MEDIUM:
- New test coverage for getStatus() mcpToken redaction (both populated and uninitialized state branches).
- New test coverage for failTrial revocation (happy path + KV-error tolerance).

doc-sync-validator HIGH:
- Add Trial Onboarding section to .claude/skills/env-reference/SKILL.md cross-referencing docs/guides/trial-configuration.md for the full table.

* fix(trial): allow multiple trials per repo (partial unique index)

The `(user_id, installation_id, repository)` unique index on `projects` prevented more than one anonymous trial per public repo: every trial after the first on the same repo hit a UNIQUE constraint failure during the projects insert in TrialOrchestrator.handleProjectCreation. The DO retried 6 times on alarm backoff then emitted a terminal `trial.error` ("step_failed"), so the user saw the 10% progress event repeat and then fail.

Why it slipped through earlier reviews: the capability tests mock D1, so no test exercised the real constraint. Staging verification only tested a single trial per repo. This surfaced the moment a second trial on `octocat/Hello-World` landed during Phase 6 verification.

Fix:
- Migration 0046 drops + recreates the index as a partial unique index that excludes the trial-sentinel user `system_anonymous_trials`. Real users still can't register duplicate project rows; sentinel-owned trial rows are isolated by `projectId` (per helpers.ts sentinel scope note).
- Drizzle schema updated with matching `.where()` clause so codegen and migration stay in sync.

Verified locally: trial-orchestrator tests pass (28/28); typecheck clean; lint clean (no new warnings).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
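The default-branch probe described in this commit (timeout via AbortController, fall back to 'main' on any failure) could look roughly like the sketch below. It is not the repository's `fetchDefaultBranch()`: the signature, the injectable `FetchLike` type, and the stub at the bottom are assumptions for illustration. The only external fact relied on is that GitHub's `GET /repos/{owner}/{repo}` response includes a `default_branch` field.

```typescript
// Minimal fetch shape so the probe can be exercised without network access (assumption).
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

async function fetchDefaultBranch(
  owner: string,
  repo: string,
  fetchImpl: FetchLike,
  timeoutMs = 5000, // the commit describes a 5s default with an env override
): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`https://api.github.com/repos/${owner}/${repo}`, {
      signal: controller.signal,
    });
    if (!res.ok) return "main"; // e.g. 404 or rate limit
    const body = (await res.json()) as { default_branch?: unknown };
    return typeof body.default_branch === "string" && body.default_branch.length > 0
      ? body.default_branch
      : "main";
  } catch {
    return "main"; // network error, abort, or malformed JSON
  } finally {
    clearTimeout(timer);
  }
}

// Offline usage with a stubbed response.
const stubFetch: FetchLike = async () => ({
  ok: true,
  json: async () => ({ default_branch: "master" }),
});
fetchDefaultBranch("octocat", "Hello-World", stubFetch).then((branch) => console.log(branch));
```

Injecting the fetch function keeps the fallback paths testable; in a Worker the global `fetch` could be passed in directly.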
trial.ready is a provisioning milestone (workspace is up), not a signal that discovery is complete. The discovery agent continues producing trial.knowledge and trial.idea events after the workspace is provisioned. Changes: - Event bus: only auto-close on trial.error, not trial.ready - Frontend: keep EventSource open after trial.ready with a 3-minute grace timer (TRIAL_DISCOVERY_STREAM_TIMEOUT_MS) for late-arriving discovery events - Header shows "Discovering <repo>…" while stream is still open after trial.ready, then "Ready: <repo>" after stream closes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
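The close rule described here (only `trial.error` is terminal; `trial.ready` leaves the bus open) can be illustrated with a small in-memory model. This is a sketch under assumptions, not the TrialEventBus Durable Object: the class name, the `append` return value, and the event shape are hypothetical.

```typescript
type BusEvent = { type: string; [key: string]: unknown };

class InMemoryTrialBus {
  private events: BusEvent[] = [];
  private closed = false;

  // Returns false when the bus has already been closed by a terminal event.
  append(event: BusEvent): boolean {
    if (this.closed) return false;
    this.events.push(event);
    // Only errors are terminal; trial.ready is a milestone and keeps the bus open.
    if (event.type === "trial.error") this.closed = true;
    return true;
  }

  get size(): number {
    return this.events.length;
  }

  get isClosed(): boolean {
    return this.closed;
  }
}

const bus = new InMemoryTrialBus();
bus.append({ type: "trial.ready", workspaceUrl: "https://example.invalid" });
bus.append({ type: "trial.knowledge", text: "late discovery event" });
console.log(bus.isClosed, bus.size);
```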
…icons - Add TrialAgentActivityEvent type and bridgeAgentActivity() to pipe agent messages/tool calls into the trial SSE stream - Hook message persistence path to emit trial.agent_activity events - Render agent activity cards in the feed (grouped, showing tool names) - Replace misleading "Workspace ready — chat below" with informative message about agent analyzing repository - Replace emoji icons (📎, ★) with lucide-react icons (BookOpen, Lightbulb, Brain, Wrench, Terminal) matching platform design - Add auto-scroll to bottom on new events (scrollIntoView smooth) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Deduplicate consecutive progress events with the same stage in the feed — the orchestrator re-emits keepalive progress while waiting for the agent, creating visual spam (3x "Starting the agent" at 70%) - Clean up agent activity text: strip XML tags, collapse JSON blobs, add line-clamp-2 for overflow - Change "AGENT WORKING..." from uppercase to normal case - Add cleanActivityText() helper for readable tool output summaries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
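A minimal sketch of the consecutive-progress deduplication follows. The event shapes and the function name are assumptions for illustration, and keeping the latest event of each run (rather than the first) is a design choice made here, not something the commit specifies.

```typescript
// Hypothetical feed event shapes; the real types live in the shared trial package.
type FeedEvent =
  | { type: "trial.progress"; stage: string; progress: number }
  | { type: "trial.knowledge"; text: string };

// Collapse runs of progress events that share a stage into a single entry.
function dedupeConsecutiveProgress(events: FeedEvent[]): FeedEvent[] {
  const out: FeedEvent[] = [];
  for (const ev of events) {
    const prev = out[out.length - 1];
    if (
      prev !== undefined &&
      prev.type === "trial.progress" &&
      ev.type === "trial.progress" &&
      prev.stage === ev.stage
    ) {
      out[out.length - 1] = ev; // keep the most recent keepalive for this stage
      continue;
    }
    out.push(ev);
  }
  return out;
}

const feed: FeedEvent[] = [
  { type: "trial.progress", stage: "Starting the agent", progress: 70 },
  { type: "trial.progress", stage: "Starting the agent", progress: 70 },
  { type: "trial.progress", stage: "Starting the agent", progress: 70 },
  { type: "trial.knowledge", text: "monorepo with pnpm" },
  { type: "trial.progress", stage: "Starting the agent", progress: 70 },
];
const deduped = dedupeConsecutiveProgress(feed);
console.log(deduped.length);
```

Only adjacent duplicates collapse, so a progress event that reappears after other activity is still shown.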
…into sam/trial-onboarding-mvp
Merge sam/trial-discovery-stream-fix into trial MVP branch, bringing: - Auto-scroll to bottom on new events - Agent activity cards grouped in feed with Lucide icons - Progress card deduplication and text cleanup - Stream stays open after trial.ready (agent continues producing events) - Default model switched to Qwen 3 30B Update trial-event-bus test to match new behavior: trial.ready no longer closes the bus since the discovery agent continues producing knowledge and idea events after workspace provisioning. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add AI usage section to the admin analytics dashboard, powered by the AI Gateway Logs API. Shows token usage, estimated cost, trial vs. authenticated breakdown, per-model metrics, and daily trends. Backend: - New admin endpoint GET /api/admin/analytics/ai-usage?period=7d queries AI Gateway logs with pagination and aggregates by model/day - AI proxy now tags requests with projectId and trialId in cf-aig-metadata for trial usage attribution - Configurable via AI_USAGE_PAGE_SIZE, AI_USAGE_MAX_PAGES env vars Frontend: - AIUsageChart component with KPI cards, stacked bar chart (tokens by model), daily usage area chart, and model breakdown table - Integrated into admin analytics dashboard above DAU chart - Graceful fallback if AI Gateway is not configured (catch + null) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…stics The CF AI Gateway Logs API uses `order_by_direction` (not `direction`) for sort order, and error responses now include the upstream body for easier debugging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Cloudflare AI Gateway Logs API enforces a maximum per_page of 50. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
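One way to keep a configurable page size within that limit is to clamp it before building the query. The sketch below is an assumption-laden illustration: the helper names are hypothetical, `per_page` and `order_by_direction` come from the commits above, and the `page` parameter and the `"desc"` value are assumptions about the API rather than something this PR states.

```typescript
// Maximum per_page accepted by the AI Gateway Logs API, per the commit above.
const AI_GATEWAY_MAX_PER_PAGE = 50;

// Parse a raw env value (e.g. AI_USAGE_PAGE_SIZE) and clamp it to the API limit.
function resolvePageSize(
  raw: string | undefined,
  fallback: number = AI_GATEWAY_MAX_PER_PAGE,
): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  const requested = Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  return Math.min(requested, AI_GATEWAY_MAX_PER_PAGE);
}

// Build the query string for one page of logs.
function buildLogsQuery(page: number, rawPageSize: string | undefined): string {
  const params = new URLSearchParams({
    page: String(page), // assumed pagination parameter
    per_page: String(resolvePageSize(rawPageSize)),
    order_by_direction: "desc", // assumed value
  });
  return params.toString();
}

console.log(buildLogsQuery(2, "200"));
```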
* fix(trial): address review findings from trial onboarding subagents

Security and correctness fixes from 7 specialist reviewers:

CRITICAL:
- Fix cookie domain mismatch: claim.ts clearClaimCookie and oauth-hook.ts buildClaimCookie now pass domain from BASE_DOMAIN (matching create.ts)

HIGH:
- TrialEventBus DO: persist `closed` flag to storage so it survives eviction
- AI proxy: sanitize error bodies: log raw errors server-side, return generic messages to clients (prevents internal URL/config leakage)
- Admin AI usage: sanitize CF API error responses the same way
- SSE events endpoint: add per-IP rate limiting (30 req/5min via KV)
- Deploy pipeline: forward ANTHROPIC_API_KEY_TRIAL as optional Worker secret
- sync-wrangler-config: inject ENVIRONMENT var into generated env sections
- Remove hardcoded DEFAULT_GATEWAY_ID; require AI_GATEWAY_ID from env

MEDIUM:
- Cron collision: move trial counter rollover from 03:00 to 05:00 UTC (avoids collision with daily analytics forward job at 03:00)
- Replace magic number in create.ts with DEFAULT_TRIAL_CLAIM_TTL_MS constant
- Add trial secrets to secrets-taxonomy.md and trial-configuration.md
- Add comprehensive trial + AI proxy env vars to .env.example
- Fix test mocks: add ctx.storage to TrialEventBus tests, add KV to SSE tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(trial): address CTO review: 6 quality improvements

1. Reject unknown IP: SSE rate limit now returns 400 when no client IP header is present, instead of sharing a single "unknown" bucket across all headerless clients. CF-Connecting-IP is always present on Workers.
2. Document KV rate limit trade-off: added inline comment explaining why KV's non-atomic read-modify-write is acceptable here (storm prevention, not exact enforcement) vs DO-based counters for credential rotation.
3. Clean up formatSse: removed unused _eventName parameter that gave the false impression the event name was being used. Updated all call sites and tests.
4. Cookie domain consistency test: new regression test suite asserting that buildClaimCookie, clearClaimCookie, and buildFingerprintCookie produce matching Domain= attributes. Explicitly demonstrates the bug where clearing without a domain fails to delete a domain-scoped cookie.
5. AI_GATEWAY_ID self-hoster safe: returns an empty summary (zero counts) when AI_GATEWAY_ID is not configured, instead of throwing. Self-hosters who don't use AI Gateway get a clean "no data" admin dashboard.
6. Fix .env.example cron default: TRIAL_CRON_ROLLOVER_CRON now shows "0 5 1 * *" matching the actual default after the collision fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
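The cookie-domain bug is easier to see in a small sketch. A browser identifies a cookie by name, domain, and path, so a clearing `Set-Cookie` must carry the same `Domain=` as the one that set it. The helpers below reuse the names from the commit but are hypothetical implementations: the signatures, the attribute set, and the `.example.com` domain are assumptions, and the 48h lifetime is taken from the PR description.

```typescript
interface CookieOptions {
  domain?: string;
  maxAgeSeconds: number;
}

function serializeCookie(name: string, value: string, opts: CookieOptions): string {
  const parts = [
    `${name}=${value}`,
    "Path=/",
    `Max-Age=${opts.maxAgeSeconds}`,
    "HttpOnly",
    "Secure",
    "SameSite=Lax",
  ];
  if (opts.domain) parts.push(`Domain=${opts.domain}`);
  return parts.join("; ");
}

// Set the claim cookie, scoped to the given domain.
function buildClaimCookie(token: string, domain?: string): string {
  return serializeCookie("sam_trial_claim", token, { domain, maxAgeSeconds: 48 * 60 * 60 });
}

// Clear it: same name, path, and domain, with Max-Age=0.
function clearClaimCookie(domain?: string): string {
  return serializeCookie("sam_trial_claim", "", { domain, maxAgeSeconds: 0 });
}

// Extract the Domain attribute from a serialized Set-Cookie value.
function domainOf(setCookie: string): string | undefined {
  const part = setCookie.split("; ").find((p) => p.startsWith("Domain="));
  return part?.slice("Domain=".length);
}

const setHeader = buildClaimCookie("signed-token", ".example.com");
const clearHeader = clearClaimCookie(".example.com");
console.log(domainOf(setHeader) === domainOf(clearHeader));
```

Calling `clearClaimCookie()` without a domain would produce a host-only cookie, which leaves the domain-scoped one in place; that is the mismatch the regression test guards against.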
Resolves package.json version conflict (take main's newer deps) and fixes simple-import-sort/exports error in packages/shared/src/constants/index.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Autofix export sort in apps/web/src/lib/api/index.ts - Move useMemo before early return in AIUsageChart (rules-of-hooks) - Prefix unused anthropicModels with _ in staging test - Add FILE SIZE EXCEPTION comments for TryDiscovery.tsx and steps.ts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Summary

Implements the zero-friction URL-to-workspace onboarding MVP from idea `01KPGJQ853C44JEREXWEZS1GQ8`. Anonymous visitors paste a public GitHub repo URL, watch a live discovery agent analyze it, and get pre-generated suggestion chips that lead into a full SAM workspace after a 2-click login.

Built as a single orchestrated PR via 5 waves (foundation + 4 parallel tracks + integration) against the `sam/trial-onboarding-mvp` integration branch. Not to be merged to main: this is flagged for @raphaeltm manual review before merge and before production configuration is applied.

cc @raphaeltm: Configuration Checklist Before Merge

Staging (`sammy.party`): zero manual steps required

The deploy pipeline provisions + flips everything automatically:

- `TRIAL_CLAIM_TOKEN_SECRET`: auto-generated by Pulumi (`infra/resources/secrets.ts`), stored encrypted in the Pulumi R2-backed state, pushed as a Worker secret by `configure-secrets.sh` (commit `086f4ded`)
- `trials:enabled=true` in KV: written by the staging deploy workflow on every run (`deploy-reusable.yml` + commit `b15ca27c` removing an invalid `--remote` flag)
- `TRIAL_LLM_PROVIDER=workers-ai`: already wired in `wrangler.toml` vars
- `TRIAL_MODEL=@cf/meta/llama-3.1-8b-instruct`
- `TRIAL_MONTHLY_CAP=1500`
- `sam_anonymous_trials` sentinel user: seeded via migration 0043

→ Nothing to click on the staging environment. A fresh `workflow_dispatch` on `deploy-staging.yml` gives you a working trial surface.

Production (`simple-agent-manager.org`): one manual step (the key)

- `wrangler secret put ANTHROPIC_API_KEY_TRIAL --env production` (separate from platform key)
- `TRIAL_LLM_PROVIDER=anthropic`, `TRIAL_MODEL=claude-3-5-haiku-latest`, `TRIAL_AGENT_TYPE=claude-code` in production vars
- `TRIAL_MONTHLY_CAP` to your preferred prod cap (default 500)
- `pnpm --filter @simple-agent-manager/api exec wrangler kv key put "trials:enabled" "true" --binding KV --env production`
- `sam_anonymous_trials` sentinel user exists on prod D1
- `trial_counter` KV namespace + `TrialCounter` DO bindings exist in prod wrangler env

Cookies

- `TRIAL_CLAIM_TOKEN_SECRET` (auto-provisioned on staging, manual on prod if desired).

Kill Switches

- `trials:enabled=false` to instantly pause trial creation. `/try` cleanly falls back to "Trials are paused" (verified on staging).
- `TRIAL_MONTHLY_CAP=0` is also a hard stop.

What Shipped

Wave 0: Foundation (`e253c08e`)

- (`packages/shared/src/trial.ts`) for requests, responses, SSE events, idea shape
- `trial_projects`, `trial_waitlist`, `sam_anonymous_trials` sentinel user
- `TrialCounter` (monthly cap), `TrialEventBus` (SSE fan-out)
- (`apps/api/src/services/trial/cookies.ts`) for fingerprint (7d) and claim (48h) tokens

Wave 1 Track A: Backend Lifecycle (`4ca29ea6`)

- `POST /api/trial/create`: validates repo URL, checks kill switch + cap, creates project under sentinel user, starts discovery session
- `GET /api/trial/status`: enabled + remaining slots + reset date (public, no auth)
- `POST /api/trial/waitlist`: cap-exceeded email capture

Wave 1 Track B: Backend Claim + SSE (`6ba2e101`)

- `GET /api/trial/:trialId/events`: SSE stream multiplexed from `TrialEventBus` DO
- `POST /api/trial/:trialId/claim`: post-OAuth handler that transfers the anonymous project from sentinel user to the newly-signed-in user, validates claim cookie
- `claim=<trialId>` query param round-trip)
- `TRIAL_LLM_PROVIDER` + `TRIAL_MODEL`

Wave 1 Track C: Frontend Discovery (`e8088705`)

- `/try` landing page (mobile-first, repo URL input, kill-switch + cap-exceeded fallbacks)
- `/try/:trialId` discovery feed consuming the SSE event stream
- `/try/cap-exceeded` + `/try/waitlist/thanks` pages
- `App.tsx`

Wave 1 Track D: Frontend Chat Gate (`1114c8fc`)

- `ChatGate` component: suggestion chip carousel + textarea + send button
- `LoginSheet` modal triggering GitHub OAuth with claim cookie preserved
- `useTrialDraft` hook: localStorage persistence of the draft across the OAuth round-trip
- `useTrialClaim` hook: post-login auto-submit of the stashed draft to the claimed project's chat

Wave 2: Integration, Automation, and Live Fix

- `sam/trial-onboarding-mvp`. Two conflicts resolved:
  - `apps/api/src/env.ts`: kept both Track A + Track B TRIAL_* env vars.
  - `apps/web/src/components/trial/ChatGate.tsx`: kept Track D's real implementation; adapted Track C's `TryDiscovery` to Track D's `TrialIdea` contract + `onAuthenticatedSubmit` callback.
- (`086f4ded`): added `infra/resources/secrets.ts` entry that auto-generates `TRIAL_CLAIM_TOKEN_SECRET` via `@pulumi/random`, and wired `configure-secrets.sh` to push it as a Worker secret. No manual `wrangler secret put` on staging ever.
- (`086f4ded` + `b15ca27c`): added a conditional step to `.github/workflows/deploy-reusable.yml` that writes `trials:enabled=true` to KV on every staging deploy (and only staging). Initial attempt used `--remote`, which is not a valid flag for `wrangler kv key put`; removed in `b15ca27c`.
- (`db1d6332`): Track A was persisting new trials to D1 only, while Track B readers (`events.ts`, `claim.ts`, `trial-runner.ts`) look up trials in KV via `readTrial()`. Every SSE connection 404'd with `"Trial not found"`. Fix mirrors the trial to KV in `POST /api/trial/create` after the D1 insert, before issuing cookies, with rollback on KV failure (D1 row deleted, `TrialCounter` slot released). `writeTrial()` also hardened to skip the `trial-by-project:` index when `projectId` is empty (would otherwise collide all pending trials on a single key). Added regression test asserting `KV.put("trial:<id>", ...)` is invoked on the happy path.

Non-negotiable Constraints Verified

- `GITHUB_REPO_URL_REGEX` in shared schemas
- `ChatGate` triggers `LoginSheet` on any send attempt by an anonymous visitor
- `TrialCounter` DO + `TRIAL_ENABLED` env var
- `packages/shared/src/trial.ts`
- `projects.userId`; anonymous projects owned by `sam_anonymous_trials` until claimed
- `TRIAL_CLAIM_TOKEN_SECRET`

Local Quality Gates

- `pnpm typecheck`: clean across all packages
- `pnpm lint`: 0 errors
- `writeTrial` regression test)

Staging Deployments

- `c2780059`
- `b15ca27c`: `Unknown argument: remote`, fixed by removing `--remote` flag
- `db1d6332`

Staging Verification (Playwright + curl, live app)

`TRIAL_ENABLED=true` on staging, end-to-end happy path exercised:

- `GET /api/trial/status`: `{"enabled":true,"remaining":1500,"resetsAt":"2026-05-01"}` ✅
- `POST /api/trial/create` with `https://github.com/sindresorhus/is`: `201` with `Set-Cookie: sam_trial_fingerprint=…` + `sam_trial_claim=…` ✅
- `GET /api/trial/:trialId/events` via real cookies: `HTTP/2 200`, `content-type: text/event-stream`, `: connected` heartbeat ✅
- `/try` landing form submission on mobile 375×667: `/try/:trialId`, ChatGate renders "Live" status, feed waits for events, zero console errors ✅

Screenshots: `trial-sse-live-mobile.png`, `trial-sse-live-desktop.png` (in `.codex/tmp/playwright-screenshots/`).

Regression spot-check

- `/dashboard` renders, project list loads, 0 console errors
- `/health` → `200 healthy`

What was NOT verified end-to-end

The OAuth claim + post-login auto-submit leg (`chat gate → login sheet → GitHub OAuth → /api/trial/:trialId/claim → stashed draft replay`) requires a real GitHub OAuth round-trip with a human. All individual components have unit + integration coverage; the OAuth leg is gated behind a real sign-in and deferred to Raphaël's manual review.

Review Status

Full specialist review was not dispatched because this PR is flagged for manual review by @raphaeltm before merge. The `needs-human-review` label is applied. Raphaël will decide whether to dispatch additional reviewers, flip production config, and proceed to merge.

Do NOT Merge Yet

`main` until Raphaël has reviewed the configuration checklist.

🤖 Generated with Claude Code